Privacy Against Statistical Inference
We propose a general statistical inference framework to capture the privacy
threat incurred by a user that releases data to a passive but curious
adversary, given utility constraints. We show that applying this general
framework to the setting where the adversary uses the self-information cost
function naturally leads to a non-asymptotic information-theoretic approach for
characterizing the best achievable privacy subject to utility constraints.
Based on these results we introduce two privacy metrics, namely average
information leakage and maximum information leakage. We prove that under both
metrics the resulting design problem of finding the optimal mapping from the
user's data to a privacy-preserving output can be cast as a modified
rate-distortion problem which, in turn, can be formulated as a convex program.
Finally, we compare our framework with differential privacy.
Comment: Allerton 2012, 8 pages
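As a toy numerical illustration of the privacy-utility tradeoff described above (a minimal sketch, not the paper's convex program): assuming binary user data, Hamming distortion as the utility measure, and a one-parameter family of symmetric release channels, the average information leakage I(X; Ŷ) can be minimized by brute force over the admissible mappings.

```python
import numpy as np

def mutual_information(p_x, Q):
    """I(X; Yhat) in nats for input pmf p_x and release channel Q[x, yhat]."""
    joint = p_x[:, None] * Q                 # p(x, yhat)
    p_y = joint.sum(axis=0)                  # marginal of the released output
    mask = joint > 0
    return float(np.sum(joint[mask]
                        * np.log(joint[mask] / (p_x[:, None] * p_y[None, :])[mask])))

# Binary private data X, released as Yhat through a symmetric "flip" channel.
# Toy instance: Hamming distortion with utility constraint E[d(X, Yhat)] <= D.
p_x = np.array([0.5, 0.5])
D = 0.2
best = None
for eps in np.linspace(0.0, 0.5, 501):       # flip probability parametrizes the mapping
    Q = np.array([[1 - eps, eps], [eps, 1 - eps]])
    distortion = eps                         # E[d] = P(flip) for Hamming loss
    if distortion <= D + 1e-12:
        leakage = mutual_information(p_x, Q) # average information leakage I(X; Yhat)
        if best is None or leakage < best[1]:
            best = (eps, leakage)

print(best)  # leakage is minimized at the largest admissible flip probability
```

The optimal mapping saturates the distortion budget, mirroring the rate-distortion flavor of the general design problem.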
An Exploration of the Role of Principal Inertia Components in Information Theory
The principal inertia components of the joint distribution of two random
variables X and Y are inherently connected to how an observation of Y is
statistically related to a hidden variable X. In this paper, we explore this
connection within an information theoretic framework. We show that, under
certain symmetry conditions, the principal inertia components play an important
role in estimating one-bit functions of X, namely f(X), given an
observation of Y. In particular, the principal inertia components bear an
interpretation as filter coefficients in the linear transformation of the
distribution of f(X) into the conditional distribution of f(X) given Y. This
interpretation naturally leads to the conjecture that the mutual information
between f(X) and Y is maximized when all the principal inertia components have
equal value. We also study the role of the principal inertia components in the
Markov chain B → X → Y → B̂, where B and B̂ are binary random variables. We
illustrate our results for the setting where X and Y are binary strings and Y
is the result of sending X through an additive noise binary channel.
Comment: Submitted to the 2014 IEEE Information Theory Workshop (ITW)
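The principal inertia components can be computed directly from a joint distribution via a singular value decomposition, the standard construction from correspondence analysis. A short sketch for a binary symmetric channel like the one mentioned above (variable names are illustrative):

```python
import numpy as np

def principal_inertia_components(P):
    """PICs of a joint pmf P[x, y]: squared singular values (other than the
    trivial leading one) of D_X^{-1/2} P D_Y^{-1/2}."""
    p_x = P.sum(axis=1)
    p_y = P.sum(axis=0)
    Q = np.diag(p_x ** -0.5) @ P @ np.diag(p_y ** -0.5)
    s = np.linalg.svd(Q, compute_uv=False)   # s[0] == 1 for any valid joint pmf
    return s[1:] ** 2

# Uniform binary X sent through a binary symmetric channel with crossover eps;
# the single principal inertia component then equals (1 - 2*eps)**2.
eps = 0.1
P = 0.5 * np.array([[1 - eps, eps], [eps, 1 - eps]])
print(principal_inertia_components(P))       # -> approximately [0.64]
```

For this doubly symmetric binary source the PIC coincides with the squared correlation coefficient of the channel, which is what makes the "filter coefficient" reading concrete.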
Bottleneck Problems: An Information and Estimation-Theoretic View
Information bottleneck (IB) and privacy funnel (PF) are two closely related
optimization problems which have found applications in machine learning, design
of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), strong
data processing inequalities, among others. In this work, we first investigate
the functional properties of IB and PF through a unified theoretical framework.
We then connect them to three information-theoretic coding problems, namely
hypothesis testing against independence, noisy source coding and dependence
dilution. Leveraging these connections, we prove a new cardinality bound for
the auxiliary variable in IB, making its computation more tractable for
discrete random variables.
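Both IB and PF trade a compression/leakage term I(X;T) against a relevance term I(Y;T) over representations T satisfying a Markov constraint T - X - Y. A minimal sketch (a toy parameter scan, not the paper's method) over a one-parameter family of binary representations:

```python
import numpy as np

def mi(joint):
    """Mutual information (nats) of a 2-D joint pmf."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px * py)[mask])))

def bsc(eps):
    """Binary symmetric channel with crossover probability eps."""
    return np.array([[1 - eps, eps], [eps, 1 - eps]])

# Markov chain T - X - Y: X ~ uniform binary, Y = BSC(X, 0.1), and T obtained
# from X via a BSC(delta) "bottleneck" channel with tunable noise delta.
p_x = np.array([0.5, 0.5])
W = bsc(0.1)                                  # p(y|x)
for delta in [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]:
    V = bsc(delta)                            # p(t|x)
    joint_xt = p_x[:, None] * V               # p(x, t)
    joint_ty = V.T @ (p_x[:, None] * W)       # p(t, y) through the Markov chain
    print(delta, mi(joint_xt), mi(joint_ty))  # compression vs. relevance
```

Sweeping delta traces one feasible curve of (I(X;T), I(Y;T)) pairs; by the data processing inequality the relevance term never exceeds the compression term.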
In the second part, we introduce a general family of optimization problems,
termed bottleneck problems, by replacing mutual information in IB and PF with
other notions of mutual information, namely f-information and Arimoto's mutual
information. We then argue that, unlike IB and PF, these problems lead to
easily interpretable guarantees in a variety of inference tasks
with statistical constraints on accuracy and privacy. Although the underlying
optimization problems are non-convex, we develop a technique to evaluate
bottleneck problems in closed form by equivalently expressing them in terms of
lower convex or upper concave envelope of certain functions. By applying this
technique to the binary case, we derive closed-form expressions for several
bottleneck problems.
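As one concrete instance of the f-information family mentioned above, taking f(t) = t² - 1 yields chi-squared information, which has a simple closed form for any discrete joint distribution (a small sketch, not taken from the paper; it assumes strictly positive marginals):

```python
import numpy as np

def chi2_information(joint):
    """f-information with f(t) = t**2 - 1, i.e. chi-squared information,
    for a 2-D joint pmf with strictly positive marginals."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    return float(np.sum(joint ** 2 / (px * py)) - 1.0)

# Doubly symmetric binary source: uniform X through a BSC with crossover 0.2;
# chi-squared information then equals (1 - 2*eps)**2.
eps = 0.2
joint = 0.5 * np.array([[1 - eps, eps], [eps, 1 - eps]])
print(chi2_information(joint))    # (1 - 2*0.2)**2 = 0.36 for this source
```

Unlike Shannon mutual information, this quantity is a finite sum of ratios, which is part of what makes the binary bottleneck problems tractable in closed form.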